A novel node splitting criterion in decision tree construction for semi-continuous HMMs
Abstract
In [1], we described how to improve Semi-Continuous Density Hidden Markov Models (SC-HMMs) to be as fast as Continuous Density HMMs (CD-HMMs), whilst outperforming them on large vocabulary recognition tasks with context-independent models. In this paper, we extend our work with SC-HMMs to context-dependent modelling. We propose a novel node splitting criterion for use with phonetic decision trees. It is based on a distance measure between mixture Gaussian probability density functions (pdfs) as used in the final tied-state SC-HMMs. This contrasts with other criteria, which are based on simplified pdfs to keep the algorithmic complexity manageable. Results on the ARPA Resource Management task show that the proposed criterion outperforms two of these criteria with simplified pdfs.
Similar resources
Fast and accurate acoustic modelling with semi-continuous HMMs
In this paper the design of accurate Semi-Continuous Density Hidden Markov Models (SC-HMMs) for acoustic modelling in large vocabulary continuous speech recognition is presented. Two methods are described to drastically improve the efficiency of the observation likelihood calculations for the SC-HMMs. First, reduced SC-HMMs are created, where each state does not share all the Gaussian probabili...
Predictive hidden Markov model selection for decision tree state tying
This paper presents a novel predictive information criterion (PIC) for hidden Markov model (HMM) selection. The PIC criterion is exploited to select the best HMMs, which provide the largest prediction information for generalization of future data. When the randomness of HMM parameters is expressed by a product of conjugate prior densities, the prediction information is derived without integral ...
Decision Tree-Based Clustering with Outlier Detection for HMM-Based Speech Synthesis
In order to express natural prosodic variations in continuous speech, sophisticated speech units such as the context-dependent phone models are usually employed in HMM-based speech synthesis techniques. Since the training database cannot practically cover all possible context factors, decision tree-based HMM states clustering is commonly applied. One of the serious problems in a decision tree-bas...
Finding Multivariate Splits in Decision Trees Using Function Optimization
We present a new method for top-down induction of decision trees (TDIDT) with multivariate binary splits at the nodes. The primary contribution of this work is a new splitting criterion called soft entropy, which is continuous and differentiable with respect to the parameters of the splitting function. Using simple gradient descent to find multivariate splits and a novel pruning technique, our ...
Global Tree Optimization: A Non-greedy Decision Tree Algorithm
A non-greedy approach for constructing globally optimal multivariate decision trees with fixed structure is proposed. Previous greedy tree construction algorithms are locally optimal in that they optimize some splitting criterion at each decision node, typically one node at a time. In contrast, global tree optimization explicitly considers all decisions in the tree concurrently. An iterative line...